AITopics | neural cleanse

Collaborating Authors

neural cleanse

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

234e691320c0ad5b45ee3c96d0d7b8f8-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 19:45:29 GMT

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Nepal (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.70)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

Quantization Blindspots: How Model Compression Breaks Backdoor Defenses

Pandey, Rohan, Ye, Eric

arXiv.org Artificial IntelligenceDec-9-2025

Backdoor attacks embed input-dependent malicious behavior into neural networks while preserving high clean accuracy, making them a persistent threat for deployed ML systems. At the same time, real-world deployments almost never serve full-precision models: post-training quantization to INT8 or lower precision is now standard practice for reducing memory and latency. This work asks a simple question: how do existing backdoor defenses behave under standard quantization pipelines? We conduct a systematic empirical study of five representative defenses across three precision settings (FP32, INT8 dynamic, INT4 simulated) and two standard vision benchmarks using a canonical BadNet attack. We observe that INT8 quantization reduces the detection rate of all evaluated defenses to 0% while leaving attack success rates above 99%. For INT4, we find a pronounced dataset dependence: Neural Cleanse remains effective on GTSRB but fails on CIFAR-10, even though backdoors continue to survive quantization with attack success rates above 90%. Our results expose a mismatch between how defenses are commonly evaluated (on FP32 models) and how models are actually deployed (in quantized form), and they highlight quantization robustness as a necessary axis in future evaluations and designs of backdoor defenses.

data mining, machine learning, quantization, (20 more...)

arXiv.org Artificial Intelligence

2512.06243

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Nepal (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

234e691320c0ad5b45ee3c96d0d7b8f8-Paper.pdf

Neural Information Processing SystemsOct-2-2025, 11:27:30 GMT

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Nepal (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.70)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense

Zhou, Qi, Ye, Zipeng, Tang, Yubo, Luo, Wenjian, Shi, Yuhui, Jia, Yan

arXiv.org Artificial IntelligenceJul-14-2024

Deep Neural Networks (DNNs) have been widely used in many areas such as autonomous driving and face recognition. However, DNN model is fragile to backdoor attack. A backdoor in the DNN model can be activated by a poisoned input with trigger and leads to wrong prediction, which causes serious security issues in applications. It is challenging for current defenses to eliminate the backdoor effectively with limited computing resources, especially when the sizes and numbers of the triggers are variable as in the physical world. We propose an efficient backdoor defense based on evolutionary trigger detection and lightweight model repair. In the first phase of our method, CAM-focus Evolutionary Trigger Filter (CETF) is proposed for trigger detection. CETF is an effective sample-preprocessing based method with the evolutionary algorithm, and our experimental results show that CETF not only distinguishes the images with triggers accurately from the clean images, but also can be widely used in practice for its simplicity and stability in different backdoor attack situations. In the second phase of our method, we leverage several lightweight unlearning methods with the trigger detected by CETF for model repair, which also constructively demonstrate the underlying correlation of the backdoor with Batch Normalization layers. Source code will be published after accepted.

backdoor attack, bn layer, cetf, (16 more...)

arXiv.org Artificial Intelligence

2407.05396

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > Nepal (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Add feedback

LSP Framework: A Compensatory Model for Defeating Trigger Reverse Engineering via Label Smoothing Poisoning

Li, Beichen, Guo, Yuanfang, Peng, Heqi, Li, Yangxi, Wang, Yunhong

arXiv.org Artificial IntelligenceApr-19-2024

Deep neural networks are vulnerable to backdoor attacks. Among the existing backdoor defense methods, trigger reverse engineering based approaches, which reconstruct the backdoor triggers via optimizations, are the most versatile and effective ones compared to other types of methods. In this paper, we summarize and construct a generic paradigm for the typical trigger reverse engineering process. Based on this paradigm, we propose a new perspective to defeat trigger reverse engineering by manipulating the classification confidence of backdoor samples. To determine the specific modifications of classification confidence, we propose a compensatory model to compute the lower bound of the modification. With proper modifications, the backdoor attack can easily bypass the trigger reverse engineering based methods. To achieve this objective, we propose a Label Smoothing Poisoning (LSP) framework, which leverages label smoothing to specifically manipulate the classification confidences of backdoor samples. Extensive experiments demonstrate that the proposed work can defeat the state-of-the-art trigger reverse engineering based methods, and possess good compatibility with a variety of existing backdoor attacks.

backdoor attack, neural cleanse, target class, (14 more...)

arXiv.org Artificial Intelligence

2404.12852

Country: Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

DFB: A Data-Free, Low-Budget, and High-Efficacy Clean-Label Backdoor Attack

Ma, Binhao, Wang, Jiahui, Wang, Dejun, Meng, Bo

arXiv.org Artificial IntelligenceJan-17-2024

In the domain of backdoor attacks, accurate labeling of injected data is essential for evading rudimentary detection mechanisms. This imperative has catalyzed the development of clean-label attacks, which are notably more elusive as they preserve the original labels of the injected data. Current clean-label attack methodologies primarily depend on extensive knowledge of the training dataset. However, practically, such comprehensive dataset access is often unattainable, given that training datasets are typically compiled from various independent sources. Departing from conventional clean-label attack methodologies, our research introduces DFB, a data-free, low-budget, and high-efficacy clean-label backdoor Attack. DFB is unique in its independence from training data access, requiring solely the knowledge of a specific target class. Tested on CIFAR10, Tiny-ImageNet, and TSRD, DFB demonstrates remarkable efficacy with minimal poisoning rates of just 0.1%, 0.025%, and 0.4%, respectively. These rates are significantly lower than those required by existing methods such as LC, HTBA, BadNets, and Blend, yet DFB achieves superior attack success rates. Furthermore, our findings reveal that DFB poses a formidable challenge to four established backdoor defense algorithms, indicating its potential as a robust tool in advanced clean-label attack strategies.

backdoor attack, dataset, neural network, (14 more...)

arXiv.org Artificial Intelligence

2308.09487

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Taiwan (0.04)
Asia > Nepal (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Pick your Poison: Undetectability versus Robustness in Data Poisoning Attacks

Lukas, Nils, Kerschbaum, Florian

arXiv.org Artificial IntelligenceJun-29-2023

Deep image classification models trained on vast amounts of web-scraped data are susceptible to data poisoning - a mechanism for backdooring models. A small number of poisoned samples seen during training can severely undermine a model's integrity during inference. Existing work considers an effective defense as one that either (i) restores a model's integrity through repair or (ii) detects an attack. We argue that this approach overlooks a crucial trade-off: Attackers can increase robustness at the expense of detectability (over-poisoning) or decrease detectability at the cost of robustness (under-poisoning). In practice, attacks should remain both undetectable and robust. Detectable but robust attacks draw human attention and rigorous model evaluation or cause the model to be re-trained or discarded. In contrast, attacks that are undetectable but lack robustness can be repaired with minimal impact on model accuracy. Our research points to intrinsic flaws in current attack evaluation methods and raises the bar for all data poisoning attackers who must delicately balance this trade-off to remain robust and undetectable. To demonstrate the existence of more potent defenders, we propose defenses designed to (i) detect or (ii) repair poisoned models using a limited amount of trusted image-label pairs. Our results show that an attacker who needs to be robust and undetectable is substantially less threatening. Our defenses mitigate all tested attacks with a maximum accuracy decline of 2% using only 1% of clean data on CIFAR-10 and 2.5% on ImageNet. We demonstrate the scalability of our defenses by evaluating large vision-language models, such as CLIP. Attackers who can manipulate the model's parameters pose an elevated risk as they can achieve higher robustness at low detectability compared to data poisoning attackers.

detectability, imagenet, robustness, (14 more...)

arXiv.org Artificial Intelligence

2305.09671

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Greece (0.04)
Asia > Nepal (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(2 more...)

Add feedback

Towards Effective and Robust Neural Trojan Defenses via Input Filtering

Do, Kien, Harikumar, Haripriya, Le, Hung, Nguyen, Dung, Tran, Truyen, Rana, Santu, Nguyen, Dang, Susilo, Willy, Venkatesh, Svetha

arXiv.org Artificial IntelligenceFeb-14-2023

Trojan attacks on deep neural networks are both dangerous and surreptitious. Over the past few years, Trojan attacks have advanced from using only a single input-agnostic trigger and targeting only one class to using multiple, input-specific triggers and targeting multiple classes. However, Trojan defenses have not caught up with this development. Most defense methods still make inadequate assumptions about Trojan triggers and target classes, thus, can be easily circumvented by modern Trojan attacks. To deal with this problem, we propose two novel "filtering" defenses called Variational Input Filtering (VIF) and Adversarial Input Filtering (AIF) which leverage lossy data compression and adversarial learning respectively to effectively purify potential Trojan triggers in the input at run time without making assumptions about the number of triggers/target classes or the input dependence property of triggers. In addition, we introduce a new defense mechanism called "Filtering-then-Contrasting" (FtC) which helps avoid the drop in classification accuracy on clean data caused by "filtering", and combine it with VIF/AIF to derive new defenses of this kind. Extensive experimental results and ablation studies show that our proposed defenses significantly outperform well-known baseline defenses in mitigating five advanced Trojan attacks including two recent state-of-the-art while being quite robust to small amounts of training data and large-norm triggers.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2202.12154

Country:

Oceania > Australia > New South Wales > Wollongong (0.04)
Asia > Nepal (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.67)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)

Add feedback

CASSOCK: Viable Backdoor Attacks against DNN in The Wall of Source-Specific Backdoor Defences

Wang, Shang, Gao, Yansong, Fu, Anmin, Zhang, Zhi, Zhang, Yuqing, Susilo, Willy, Liu, Dongxi

arXiv.org Artificial IntelligenceDec-18-2022

As a critical threat to deep neural networks (DNNs), backdoor attacks can be categorized into two types, i.e., source-agnostic backdoor attacks (SABAs) and source-specific backdoor attacks (SSBAs). Compared to traditional SABAs, SSBAs are more advanced in that they have superior stealthier in bypassing mainstream countermeasures that are effective against SABAs. Nonetheless, existing SSBAs suffer from two major limitations. First, they can hardly achieve a good trade-off between ASR (attack success rate) and FPR (false positive rate). Besides, they can be effectively detected by the state-of-the-art (SOTA) countermeasures (e.g., SCAn). To address the limitations above, we propose a new class of viable source-specific backdoor attacks, coined as CASSOCK. Our key insight is that trigger designs when creating poisoned data and cover data in SSBAs play a crucial role in demonstrating a viable source-specific attack, which has not been considered by existing SSBAs. With this insight, we focus on trigger transparency and content when crafting triggers for poisoned dataset where a sample has an attacker-targeted label and cover dataset where a sample has a ground-truth label. Specifically, we implement $CASSOCK_{Trans}$ and $CASSOCK_{Cont}$. While both they are orthogonal, they are complementary to each other, generating a more powerful attack, called $CASSOCK_{Comp}$, with further improved attack performance and stealthiness. We perform a comprehensive evaluation of the three $CASSOCK$-based attacks on four popular datasets and three SOTA defenses. Compared with a representative SSBA as a baseline ($SSBA_{Base}$), $CASSOCK$-based attacks have significantly advanced the attack performance, i.e., higher ASR and lower FPR with comparable CDA (clean data accuracy). Besides, $CASSOCK$-based attacks have effectively bypassed the SOTA defenses, and $SSBA_{Base}$ cannot.

artificial intelligence, cover sample, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2206.00145

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Nepal (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Add feedback

Backdoor Mitigation in Deep Neural Networks via Strategic Retraining

Dhonthi, Akshay, Hahn, Ernst Moritz, Hashemi, Vahid

arXiv.org Artificial IntelligenceDec-14-2022

Deep Neural Networks (DNN) are becoming increasingly more important in assisted and automated driving. Using such entities which are obtained using machine learning is inevitable: tasks such as recognizing traffic signs cannot be developed reasonably using traditional software development methods. DNN however do have the problem that they are mostly black boxes and therefore hard to understand and debug. One particular problem is that they are prone to hidden backdoors. This means that the DNN misclassifies its input, because it considers properties that should not be decisive for the output. Backdoors may either be introduced by malicious attackers or by inappropriate training. In any case, detecting and removing them is important in the automotive area, as they might lead to safety violations with potentially severe consequences. In this paper, we introduce a novel method to remove backdoors. Our method works for both intentional as well as unintentional backdoors. We also do not require prior knowledge about the shape or distribution of backdoors. Experimental evidence shows that our method performs well on several medium-sized examples.

artificial intelligence, backdoor, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2212.07278

Country:

Europe > Netherlands (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Nepal (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Transportation > Ground > Road (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback